Toward data mining engineering: A software engineering approach

Authors

  • Óscar Marbán
  • Javier Segovia
  • Ernestina Menasalvas Ruiz
  • Covadonga Fernández-Baizán
Abstract

The number, variety and complexity of projects involving data mining or knowledge discovery in databases have recently increased at such a pace that aspects of their development process need to be standardized so that results can be integrated, reused and interchanged in the future. Data mining projects are quickly becoming engineering projects, and current standard processes, like CRISP-DM, need to be revisited to incorporate this engineering viewpoint. This is the central motivation of this paper, which argues that experience gained about the software development process over almost 40 years could be reused and integrated to improve data mining processes. Consequently, this paper proposes to reuse ideas and concepts underlying the IEEE Std 1074 and ISO 12207 software engineering model processes to redefine and extend the CRISP-DM process and make it a data mining engineering standard. © 2008 Elsevier B.V. All rights reserved.

Similar resources

Application of Data Mining Techniques on Software Engineering Data for Software Quality

The processes of software engineering are complex and produce a large number and variety of artifacts. Applying data mining techniques to this large, valuable body of data can help to better manage software projects and to produce high-quality software systems that are delivered on time and within budget. This paper presents the latest research in mining software engineering data, software engineerin...

Presented a method for estimating the cost of software using PCA to reduce the size and with the help of data mining

These days, data mining is one of the most significant issues. Data mining is a field that mixes computer science and statistics, which is considerably limited due to increase in digital data and growth of the computational power of computers. One of the domains of data mining is software cost estimation. In this article, classifying techniques of learning algorithm of machine ...

Using Data Mining For Automated Software Testing

In today’s software industry, the design of test cases is mostly based on human expertise, while test automation tools are limited to execution of pre-planned tests only. Evaluation of test outcomes is also associated with a considerable effort by human testers who often have imperfect knowledge of the requirements specification. Not surprisingly, this manual approach to software testing result...

An Engineering Approach to Data Mining Projects

Both the number and complexity of data mining projects have increased in recent years. Unfortunately, there is currently no formal process model for this kind of project, or existing approaches are not correct or complete enough. In some sense, the present situation is comparable to the one in software that led to the "software crisis" in the late 1960s. Software Engineering matured based on process models and ...

Repeatable experiments in software engineering

Welcome to the special issue of Empirical Software Engineering on repeatable experiments in software engineering. Earlier and shorter versions of the papers presented here first appeared at the PROMISE 2007 workshop in Minneapolis. The PROMISE project has been running for 4 years now and aims to create large libraries of repeatable experiments in software engineering. PROMISE is somewhat differ...

Journal title:
  • Inf. Syst.

Volume 34

Publication date: 2009